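Estimating forest aboveground biomass (AGB) at large scales and fine spatial resolutions has become increasingly important for greenhouse gas accounting, monitoring, and verification efforts to mitigate climate change. Airborne LiDAR is highly valuable for modeling attributes of forest structure, including AGB, but most LiDAR collections take place at local or regional scales and cover irregular, non-contiguous footprints, resulting in a patchwork of different landscape segments acquired at various points in time. Here, as part of a statewide forest carbon assessment for New York State (USA), we address common obstacles to leveraging a LiDAR patchwork for AGB mapping at landscape scales, including the selection of training data, the investigation of regional or coverage-specific patterns in prediction error, and map agreement with field inventory across multiple scales. Three machine learning algorithms and an ensemble model were trained with FIA field measurements, airborne LiDAR, and topographic, climatic, and cadastral geodata. Using a strict set of plot selection criteria, 801 FIA plots were selected, with co-located point clouds drawn from a patchwork of 17 leaf-off LiDAR coverages (2014-2019). Our ensemble model was used to generate 30 m AGB prediction surfaces within a predictor-defined area of applicability (98% of LiDAR coverage), and the resulting AGB maps were compared with FIA plot-level and areal estimates. Our model was accurate overall (%RMSE 22-45%; MAE 11.6-29.4 Mg ha$^{-1}$; ME 2.4-6.3 Mg ha$^{-1}$), explained 73-80% of field-observed variation, and yielded estimates consistent with FIA's design-based estimates (89% of estimates within FIA's 95% CI). We share practical solutions to the challenges of using spatiotemporal patchworks of LiDAR to meet the growing need for AGB mapping in support of forest carbon accounting and ecosystem applications.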
translated by Google Translate
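The agreement statistics quoted in the biomass abstract above (%RMSE, MAE, ME) can be illustrated with a minimal sketch. The function name `agreement_metrics`, the sign convention (predicted minus observed), the normalisation of %RMSE by the mean observed value, and the sample numbers are all assumptions made for illustration; the abstract does not state which conventions the authors used.

```python
import math

def agreement_metrics(observed, predicted):
    """Return (%RMSE, MAE, ME) for paired plot-level AGB values in Mg/ha.

    Assumptions (not stated in the abstract): errors are predicted minus
    observed, and %RMSE is RMSE divided by the mean observed value.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two equal-length, non-empty sequences")
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    me = sum(errors) / n  # positive means overprediction under this convention
    pct_rmse = 100.0 * rmse / (sum(observed) / n)
    return pct_rmse, mae, me

# Made-up illustrative values, not data from the study.
observed = [100.0, 150.0, 200.0, 50.0]
predicted = [110.0, 140.0, 190.0, 70.0]
pct_rmse, mae, me = agreement_metrics(observed, predicted)
print(f"%RMSE={pct_rmse:.1f}  MAE={mae:.1f}  ME={me:.1f}")
```

ME keeps the sign of the errors and so indicates systematic bias, while MAE and RMSE measure error magnitude, which is why the three are usually reported together.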
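We introduce compositional soft prompting (CSP), a parameter-efficient learning technique that improves the zero-shot compositionality of large-scale pretrained vision-language models (VLMs). VLMs can represent arbitrary classes as natural language prompts in their flexible text encoders, but they underperform on compositional zero-shot benchmark tasks. To improve VLMs, we propose a novel form of soft prompting. We treat the attributes and objects that are composed to define classes as learnable tokens of vocabulary and tune them on multiple prompt compositions. During inference, we recompose the learned attribute-object vocabulary in new combinations. We show that CSP outperforms the original VLM on benchmark datasets by an average of 10.9 percentage points on AUC. CSP also outperforms CoOp, a soft prompting method that tunes the prefix context, by an average of 5.8 percentage points on AUC. We perform additional experiments to show that CSP improves generalization to attribute-only classification, higher-order attribute-attribute-object compositions, and combinations of pretrained attributes and fine-tuned objects.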
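Large-scale self-supervised pre-training of Transformer language models has advanced the field of natural language processing and shown promise in cross-application to the biological "languages" of proteins and DNA. Learning effective representations of DNA sequences from large genomic sequence corpora may accelerate the development of models of gene regulation through transfer learning. However, to accurately model cell type-specific gene regulation and function, it is necessary to consider not only the information contained in DNA nucleotide sequences, which is mostly invariant between cell types, but also how the local chemical and structural "epigenetic state" of chromosomes varies between cell types. Here, we introduce a Bidirectional Encoder Representations from Transformers (BERT) model that learns representations from both DNA sequence and paired epigenetic state inputs, which we call Epigenomic BERT (or EBERT). We pre-train EBERT with a masked language model objective across the entire human genome and across 127 cell types. Training this complex model was made possible for the first time through a partnership with Cerebras Systems, whose CS-1 system powered all pre-training experiments. We show EBERT's transfer learning potential by demonstrating strong performance on a cell type-specific transcription factor binding prediction task. Our fine-tuned model exceeds state-of-the-art performance on 4 of 13 evaluation datasets from the ENCODE-DREAM benchmark and earns an overall rank of 3rd on the challenge leaderboard. We explore how the inclusion of epigenetic data and task-specific feature augmentation affect transfer learning performance.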
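Machine learning practitioners often have access to a spectrum of data: labeled data for the target task (which is often limited), unlabeled data, and auxiliary data, the many available labeled datasets for other tasks. We describe TAGLETS, a system built to study techniques for automatically exploiting all three types of data and creating high-quality, servable classifiers. The key components of TAGLETS are: (1) auxiliary data organized according to a knowledge graph, (2) modules encapsulating different methods for exploiting auxiliary and unlabeled data, and (3) a distillation stage in which the ensembled modules are combined into a servable model. We compare TAGLETS with state-of-the-art transfer learning and semi-supervised learning methods on four image classification tasks. Our study covers a range of settings, varying the amount of labeled data and the semantic relatedness of the auxiliary data to the target task. We find that the intelligent incorporation of auxiliary and unlabeled data into multiple learning techniques enables TAGLETS to match, and most often surpass, these alternatives. TAGLETS is available as an open-source system at Github.com/batsresearch/taglet.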
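Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site, which is currently determined with biopsy and histology. Here, we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study comprised patients (n = 1,399) referred for MRI treatment planning and gamma knife radiosurgery over 19 years. Contrast-enhanced T1-weighted and T2-weighted fluid-attenuated inversion recovery brain MRI exams (n = 1,582) were preprocessed and input to the proposed deep learning workflow for tumor segmentation, modality transfer, and primary site classification into one of five classes (lung, breast, melanoma, renal, and other). Ten-fold cross-validation yielded an overall AUC of 0.947 (95% CI: 0.938, 0.955), a lung class AUC of 0.899 (95% CI: 0.884, 0.915), a breast class AUC of 0.990 (95% CI: 0.983, 0.997), a melanoma class AUC of 0.882 (95% CI: 0.858, 0.906), a renal class AUC of 0.870 (95% CI: 0.823, 0.918), and an other class AUC of 0.885 (95% CI: 0.843, 0.949). These data establish that whole-brain imaging features are discriminative enough to allow accurate diagnosis of the primary organ site of malignancy. Our end-to-end deep radiomic approach has great potential for classifying metastatic tumor types from whole-brain MRI images. Further refinement may offer an invaluable clinical tool to expedite primary cancer site identification for precision treatment and improved outcomes.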
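Zero-shot learning relies on semantic class representations, such as hand-engineered attributes or learned embeddings, to predict classes without any labeled examples. We propose to learn class representations by embedding nodes from common sense knowledge graphs in a vector space. Common sense knowledge graphs are an untapped source of explicit high-level knowledge that requires little human effort to apply to a range of tasks. To capture the knowledge in the graph, we introduce ZSL-KG, a general-purpose framework with a novel transformer graph convolutional network (TRGCN) for generating class representations. Our proposed TRGCN architecture computes non-linear combinations of node neighbourhoods. Our results show that ZSL-KG improves over existing WordNet-based methods on five out of six zero-shot benchmark datasets in language and vision.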
In this paper we explore the task of modeling (semi-)structured object sequences; in particular, we focus on the problem of developing a structure-aware input representation for such sequences. We assume that each structured object is represented by a set of key-value pairs that encode the attributes of the object. Given a universe of keys, a sequence of structured objects can then be viewed as an evolution of the values for each key over time. We encode and construct a sequential representation using the values for a particular key (Temporal Value Modeling - TVM) and then self-attend over the set of key-conditioned value sequences to create a representation of the structured object sequence (Key Aggregation - KA). We pre-train and fine-tune the two components independently and present an innovative training schedule that interleaves the training of both modules with shared attention heads. We find that this iterative two-part training results in better performance than a unified network with hierarchical encoding, as well as better performance than other methods that use a record-view representation of the sequence \cite{de2021transformers4rec} or a simple flattened representation of the sequence. We conduct experiments using real-world data to demonstrate the advantage of interleaving TVM-KA on multiple tasks, along with detailed ablation studies motivating our modeling choices. We find that our approach performs better than flattening sequence objects and also allows us to operate on significantly larger sequences than existing methods.
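The key-wise view described above, in which a sequence of key-value objects becomes one value sequence per key, can be sketched as a plain data transformation. This is only an illustration of the input restructuring and involves no model; the function name `key_conditioned_sequences`, the `missing` placeholder, and the sample records are hypothetical.

```python
def key_conditioned_sequences(objects, missing=None):
    """Turn a sequence of key-value objects into one value sequence per key.

    Every per-key sequence has the same length as the input, so position t
    in each sequence refers to the t-th object; absent keys get `missing`.
    """
    keys = sorted({key for obj in objects for key in obj})
    per_key = {key: [] for key in keys}
    for obj in objects:
        for key in keys:
            per_key[key].append(obj.get(key, missing))
    return per_key

def flattened(objects):
    """The simple flattened alternative: one long list of (key, value) pairs."""
    return [(key, value) for obj in objects for key, value in obj.items()]

# Hypothetical records, made up for illustration.
events = [
    {"item": "shoes", "price": 40},
    {"item": "socks"},
    {"item": "hat", "price": 15},
]
per_key = key_conditioned_sequences(events)
print(per_key)
print(flattened(events))
```

In the flattened form, the length of the single sequence grows with the number of objects times the number of keys, whereas each key-conditioned sequence grows only with the number of objects.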
Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments that increase the speed of acquisition often result in systems with a narrower spectral bandwidth and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data; more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. Anticipating the reduced resolution that accompanies wide-field OCT systems, we build on super-resolution techniques to reconstruct lost features using a pixel-to-pixel approach with a modified super-resolution generative adversarial network (SRGAN) architecture, with the aim of better supporting clinicians' decision-making and improving patient outcomes.
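The effect of Gaussian windowing in the spectral domain can be illustrated with a toy simulation. It is a minimal sketch under strong assumptions: a single ideal reflector, fringes sampled linearly in wavenumber, and an arbitrary window width. The helper names (`gaussian_window`, `a_scan`, `peak_width`) and all parameter values are hypothetical and are not taken from the study.

```python
import cmath
import math

def gaussian_window(n, sigma_fraction):
    """Gaussian centred on the spectrum; sigma is given as a fraction of n."""
    centre = (n - 1) / 2.0
    sigma = sigma_fraction * n
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n)]

def a_scan(spectrum):
    """Magnitude of the DFT of a spectral interferogram (positive depths only)."""
    n = len(spectrum)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(spectrum)))
        for k in range(n // 2)
    ]

def peak_width(profile):
    """Number of depth bins at or above half of the peak value."""
    half = max(profile) / 2.0
    return sum(1 for value in profile if value >= half)

n = 256          # spectral samples per A-scan (assumed)
depth_bin = 40   # depth of the single simulated reflector (assumed)
fringes = [math.cos(2 * math.pi * depth_bin * i / n) for i in range(n)]

full = a_scan(fringes)
window = gaussian_window(n, 0.05)  # keeps only a narrow band of the spectrum
narrow = a_scan([f * w for f, w in zip(fringes, window)])

print("full-bandwidth peak width:", peak_width(full))
print("windowed peak width:", peak_width(narrow))
```

The reflector stays at the same depth in both profiles, but the windowed profile has a wider peak, which is the loss of axial resolution that the reconstruction network is asked to undo.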
Real-life tools for decision-making in many critical domains are based on ranking results. With the increasing awareness of algorithmic fairness, recent works have presented measures for fairness in ranking. Many of those definitions consider the representation of different "protected groups" in the top-$k$ ranked items, for any reasonable $k$. Given the protected groups, confirming algorithmic fairness is a simple task. However, the groups' definitions may be unknown in advance. In this paper, we study the problem of detecting groups with biased representation in the top-$k$ ranked items, eliminating the need to pre-define protected groups. The number of possible groups can be exponential, making the problem hard. We propose efficient search algorithms for two different fairness measures: global representation bounds and proportional representation. We then propose a method to explain the bias in the representations of groups using the notion of Shapley values. We conclude with an experimental study showing the scalability of our approach and demonstrating the usefulness of the proposed algorithms.
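To make the detection problem concrete, the sketch below scans every attribute-value pattern by brute force and flags groups whose count in the top-$k$ falls below a fraction `alpha` of their proportional share. This is a naive baseline written for illustration, not the efficient search algorithms proposed in the paper; the threshold form `alpha * k * size / n`, the function name, and the sample ranking are assumptions.

```python
from itertools import combinations

def underrepresented_groups(ranked, k, alpha=0.8, min_size=2):
    """Brute-force scan for groups with fewer top-k members than
    alpha * k * (group size / total items). Exponential in the number
    of attributes, which is why smarter search is needed in practice."""
    attrs = sorted(ranked[0].keys())
    n = len(ranked)
    top = ranked[:k]
    found = []
    for r in range(1, len(attrs) + 1):
        for subset in combinations(attrs, r):
            patterns = sorted({tuple(item[a] for a in subset) for item in ranked})
            for values in patterns:
                def matches(item):
                    return all(item[a] == v for a, v in zip(subset, values))
                size = sum(1 for item in ranked if matches(item))
                in_top = sum(1 for item in top if matches(item))
                if size >= min_size and in_top < alpha * k * size / n:
                    found.append((dict(zip(subset, values)), in_top, size))
    return found

# Hypothetical ranking (best first), made up for illustration.
ranked = [
    {"gender": "M", "region": "north"},
    {"gender": "M", "region": "south"},
    {"gender": "M", "region": "north"},
    {"gender": "F", "region": "north"},
    {"gender": "M", "region": "south"},
    {"gender": "F", "region": "south"},
    {"gender": "F", "region": "south"},
    {"gender": "F", "region": "north"},
]
found = underrepresented_groups(ranked, k=4)
for group, in_top, size in found:
    print(group, f"top-k count={in_top}, group size={size}")
```

With two attributes the scan covers only a handful of patterns, but the number of patterns grows exponentially with the number of attributes, which is the hardness the abstract refers to.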
Previous fine-grained datasets mainly focus on classification and are often captured in a controlled setup, with the camera focusing on the objects. We introduce the first Fine-Grained Vehicle Detection (FGVD) dataset in the wild, captured from a moving camera mounted on a car. It contains 5502 scene images with 210 unique fine-grained labels of multiple vehicle types organized in a three-level hierarchy. While previous classification datasets also include makes for different kinds of cars, the FGVD dataset introduces new class labels for categorizing two-wheelers, autorickshaws, and trucks. The FGVD dataset is challenging because it contains vehicles in complex traffic scenarios with intra-class and inter-class variations in type, scale, pose, occlusion, and lighting conditions. Current object detectors such as YOLOv5 and Faster R-CNN perform poorly on our dataset due to a lack of hierarchical modeling. Along with providing baseline results for existing object detectors on the FGVD dataset, we also present the results of combining an existing detector with the recent Hierarchical Residual Network (HRN) classifier for the FGVD task. Finally, we show that FGVD vehicle images are the most challenging to classify among the fine-grained datasets.